List of AI News about Selective Gradient Masking
| Time | Details |
|---|---|
|
2025-12-09 19:47 |
Anthropic Unveils Selective Gradient Masking (SGTM) for Isolating High-Risk AI Knowledge
According to Anthropic (@AnthropicAI), the Anthropic Fellows Program has introduced Selective GradienT Masking (SGTM), a new AI training technique that enables developers to isolate high-risk knowledge, such as information about dangerous weapons, within a confined set of model parameters. This approach allows for the targeted removal of sensitive knowledge without significantly impairing the model's overall performance, offering a practical solution for safer AI deployment in regulated industries and reducing downstream risks (source: AnthropicAI Twitter, Dec 9, 2025). |
|
2025-12-09 19:47 |
SGTM: Selective Gradient Masking Enables Safer AI by Splitting Model Weights for High-Risk Deployments
According to Anthropic (@AnthropicAI), the Selective Gradient Masking (SGTM) technique divides a model’s weights into 'retain' and 'forget' subsets during pretraining, intentionally guiding sensitive or high-risk knowledge into the 'forget' subset. Before deployment in high-risk environments, this subset can be removed, reducing the risk of unintended outputs or misuse. This approach provides a practical solution for organizations seeking to deploy advanced AI models with granular control over sensitive knowledge, addressing compliance and safety requirements in regulated industries. Source: alignment.anthropic.com/2025/selective-gradient-masking/ |